Robust Multiple Resolution an Speech Recogn
Abstract
This paper describes the use of time-domain denoising techniques applied to the outputs of the filters of a Multi-Resolution Analysis. Because the energies of the denoised samples are used as features for Automatic Speech Recognition (ASR), soft thresholding is particularly attractive, especially if Principal Component Analysis (PCA) is applied to the whole tree of energy features. This consideration is supported by experimental results on a very large test set that includes many speakers uttering proper names from different locations of the Italian public telephone network. The results show that soft thresholding outperforms J-Rasta PLP, with a word error rate (WER) reduction, after denoising, of 26%.
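The abstract does not give the exact denoising formula, so the following is only a minimal sketch of the standard soft-thresholding rule applied to the time-domain samples of one filter output, followed by the energy of the denoised samples. The function names, the threshold value `thr`, and the use of a log energy with a small floor are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def soft_threshold(x, thr):
    """Standard soft thresholding: shrink each sample toward zero by thr.

    Samples with |x| <= thr are set to 0. The value of thr is an
    assumption here; the abstract does not say how it is chosen.
    """
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - thr, 0.0)

def band_log_energy(samples, thr, floor=1e-10):
    """Log energy of one filter output after soft thresholding.

    The log and the floor (which avoids log(0)) are illustrative choices.
    """
    denoised = soft_threshold(samples, thr)
    return float(np.log(np.sum(denoised ** 2) + floor))
```

Under these assumptions, one such energy would be computed per filter of the analysis tree for each frame, and the resulting feature vector could then be decorrelated with PCA as the abstract describes.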
Similar resources
Correlation between Auditory Spectral Resolution and Speech Perception in Children with Cochlear Implants
Background: Variability in speech performance is a major concern for children with cochlear implants (CIs). Spectral resolution is an important acoustic component in speech perception. Considerable variability and limitations of spectral resolution in children with CIs may lead to individual differences in speech performance. The aim of this study was to assess the correlation between auditory ...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Robust Fuzzy Content Based Regularization Technique in Super Resolution Imaging
Super-resolution (SR) aims to overcome the ill-posed conditions of image acquisition. SR facilitates scene recognition from low-resolution image(s). It is generally assumed that high- and low-resolution images share similar intrinsic geometries. Various approaches have tried to aggregate the informative details of multiple low-resolution images into a high-resolution one. In this paper, we present a n...
The FOPHO Speech Recognition Project
The FOPHO (Foreign Phonetician) speech recognition project concerns the development of a system to produce a reasonably high quality phonetic transcription output from continuous speech input. The system is developed to perform in a way which approximates the actions of a phonetician trying to transcribe a foreign tongue, (in the case of FOPHO, Aust...
A Review on Speech Recognition with Filters
Speech recognition is a popular topic today because of its numerous applications. For example, on a mobile phone, instead of typing the name of the person the user wants to call, the user can simply speak the name and the phone will automatically call that person. The Speech is most prominent & primary mode...
Publication date: 2002